AITopics | anchor word

anchor word

Conic Scan-and-Cover algorithms for nonparametric topic modeling

Mikhail Yurochkin, Aritra Guha, XuanLong Nguyen

Neural Information Processing Systems, Apr-23-2026, 13:58:39 GMT

We propose new algorithms for topic modeling when the number of topics is unknown. Our approach relies on an analysis of the concentration of mass and angular geometry of the topic simplex, a convex polytope constructed by taking the convex hull of vertices representing the latent topics. In practice, our algorithms achieve topic-estimation accuracy comparable to that of a Gibbs sampler, which requires the number of topics to be given. Moreover, they are among the fastest of several state-of-the-art parametric techniques. Statistical consistency of our estimator is established under some conditions.

algorithm, artificial intelligence, machine learning, (16 more...)

Country: North America > United States (0.28)

Genre: Research Report (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)

Anchor-Free Correlated Topic Modeling: Identifiability and Algorithm

Kejun Huang, Xiao Fu, Nikolaos D. Sidiropoulos

Neural Information Processing Systems, Apr-22-2026, 08:41:36 GMT

In topic modeling, many algorithms that guarantee identifiability of the topics have been developed under the premise that there exist anchor words, i.e., words that appear (with positive probability) in only one topic. Follow-up work has resorted to third- or higher-order statistics of the data corpus to relax the anchor word assumption. Reliable estimates of higher-order statistics are hard to obtain, however, and the identification of topics under those models hinges on uncorrelatedness of the topics, which can be unrealistic. This paper revisits topic modeling based on second-order moments and proposes an anchor-free topic mining framework. The proposed approach guarantees the identification of the topics under a much milder condition than the anchor-word assumption, thereby exhibiting much better robustness in practice. The associated algorithm involves only one eigendecomposition and a few small linear programs. This makes it easy to implement and scale up to very large problem instances. Experiments using the TDT2 and Reuters-21578 corpora demonstrate that the proposed anchor-free approach exhibits very favorable performance (measured using coherence, similarity count, and clustering accuracy metrics) compared to the prior art.
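
The anchor word definition in the abstract above can be illustrated with a minimal Python sketch. It is not the method of any paper listed here: the function name `find_anchor_words`, the toy vocabulary, and the toy topic-word probabilities are all hypothetical, chosen only to show what "appears with positive probability in one topic" means.

```python
from typing import Dict, List


def find_anchor_words(topic_word: List[List[float]], vocab: List[str]) -> Dict[int, List[str]]:
    """Map each topic index to the words with positive probability in that topic only."""
    anchors: Dict[int, List[str]] = {k: [] for k in range(len(topic_word))}
    for j, word in enumerate(vocab):
        # Collect the topics that give this word non-zero probability.
        topics_with_mass = [k for k, row in enumerate(topic_word) if row[j] > 0.0]
        if len(topics_with_mass) == 1:
            anchors[topics_with_mass[0]].append(word)
    return anchors


# Hypothetical toy model: two topics over a five-word vocabulary.
vocab = ["goal", "striker", "ballot", "senate", "win"]
topic_word = [
    [0.4, 0.3, 0.0, 0.0, 0.3],  # topic 0 (sports-like)
    [0.0, 0.0, 0.4, 0.3, 0.3],  # topic 1 (politics-like)
]

anchors = find_anchor_words(topic_word, vocab)
print(anchors)
```

In this toy model "win" has mass in both topics, so it is not an anchor word for either one.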

artificial intelligence, machine learning, natural language, (18 more...)

Country: North America > United States > Minnesota > Hennepin County > Minneapolis (0.28)

Industry: Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (0.95)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.47)

A Reduction for Efficient LDA Topic Reconstruction

Matteo Almanza, Flavio Chierichetti, Alessandro Panconesi, Andrea Vattani

Neural Information Processing Systems, Feb-15-2026, 03:00:05 GMT

algorithm, reconstruction, topic reconstruction, (15 more...)

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > Italy > Lazio > Rome (0.04)
North America > Canada > Quebec > Montreal (0.04)
(2 more...)

Genre: Research Report (0.48)

Technology: Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.34)

Multilingual Anchoring: Interactive Topic Modeling and Alignment Across Languages

Michelle Yuan, Benjamin Van Durme, Jordan L. Ying

Neural Information Processing Systems, Feb-12-2026, 11:38:41 GMT

anchor word, multilingual, topic model, (13 more...)

Country:

North America > United States > Maryland (0.04)
Asia > Middle East > Jordan (0.04)
North America > United States > California (0.04)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Communications (0.96)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.54)

Multilingual Anchoring: Interactive Topic Modeling and Alignment Across Languages

Michelle Yuan, Benjamin Van Durme, Jordan L. Ying

Neural Information Processing Systems, Nov-20-2025, 23:23:28 GMT

artificial intelligence, machine learning, natural language, (15 more...)

Country:

North America > United States > Maryland (0.04)
Asia > Middle East > Jordan (0.04)
North America > United States > California (0.04)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Communications (0.96)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.34)

A Reduction for Efficient LDA Topic Reconstruction

Matteo Almanza, Flavio Chierichetti, Alessandro Panconesi, Andrea Vattani

Neural Information Processing Systems, Nov-20-2025, 20:58:54 GMT

We present a novel approach for LDA (Latent Dirichlet Allocation) topic reconstruction.

algorithm, artificial intelligence, natural language, (17 more...)

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > Italy > Lazio > Rome (0.04)
North America > Canada > Quebec > Montreal (0.04)
(2 more...)

Genre: Research Report (0.48)

Technology: Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.34)

On some provably correct cases of variational inference for topic models

Pranjal Awasthi, Andrej Risteski

Neural Information Processing Systems, Oct-2-2025, 08:06:56 GMT

artificial intelligence, machine learning, natural language, (17 more...)

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > New Jersey > Middlesex County > New Brunswick (0.04)
North America > United States > New Jersey > Mercer County > Princeton (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.52)

Robust Spectral Inference for Joint Stochastic Matrix Factorization

Moontae Lee, David Bindel, David Mimno

Neural Information Processing Systems, Oct-2-2025, 06:18:55 GMT

Spectral inference provides fast algorithms and provable optimality for latent topic analysis. But for real data these algorithms require additional ad-hoc heuristics, and even then often produce unusable results. We explain this poor performance by casting the problem of topic inference in the framework of Joint Stochastic Matrix Factorization (JSMF) and showing that previous methods violate the theoretical conditions necessary for a good solution to exist. We then propose a novel rectification method that learns high quality topics and their interactions even on small, noisy data. This method achieves results comparable to probabilistic techniques in several domains while maintaining scalability and provable optimality.
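
The abstract above does not spell out the JSMF model or its conditions. As a rough illustration only, the sketch below assumes one common formulation: a word co-occurrence matrix C = B A Bᵀ, where B is a column-stochastic word-topic matrix and A is a symmetric topic-topic matrix whose entries sum to one. Under that assumption C is symmetric, entrywise non-negative, and sums to one. All matrices and helper names here are hypothetical toy values, not data or code from the paper.

```python
from typing import List

Matrix = List[List[float]]


def matmul(x: Matrix, y: Matrix) -> Matrix:
    """Plain-Python matrix product of x (n x m) and y (m x p)."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
            for i in range(len(x))]


def transpose(x: Matrix) -> Matrix:
    """Swap rows and columns."""
    return [list(col) for col in zip(*x)]


# Assumed toy word-topic matrix B (3 words x 2 topics); each column sums to 1.
B = [[0.5, 0.0],
     [0.5, 0.2],
     [0.0, 0.8]]

# Assumed toy topic-topic matrix A; symmetric, entries sum to 1.
A = [[0.4, 0.1],
     [0.1, 0.4]]

# Co-occurrence matrix under the assumed model C = B A B^T.
C = matmul(matmul(B, A), transpose(B))

total_mass = sum(sum(row) for row in C)
is_symmetric = all(abs(C[i][j] - C[j][i]) < 1e-12 for i in range(3) for j in range(3))
is_nonnegative = all(value >= 0.0 for row in C for value in row)

print(total_mass, is_symmetric, is_nonnegative)
```

An empirical co-occurrence matrix estimated from a small, noisy corpus need not satisfy these properties exactly, which is the kind of mismatch a rectification step would aim to repair.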

algorithm, matrix, projection, (13 more...)

Country:

North America > United States > New York > Tompkins County > Ithaca (0.04)
Asia > Middle East > Jordan (0.04)
North America > United States > Nevada (0.04)

Industry:

Leisure & Entertainment (0.68)
Media > Film (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Prompt Tuning for Few-Shot Continual Learning Named Entity Recognition

Ren, Zhe

arXiv.org Artificial Intelligence, Aug-12-2025

Knowledge distillation has been successfully applied to Continual Learning Named Entity Recognition (CLNER) tasks by using a teacher model trained on old-class data to distill old-class entities present in new-class data as a form of regularization, thereby avoiding catastrophic forgetting. However, in Few-Shot CLNER (FS-CLNER) tasks, the scarcity of new-class entities makes it difficult for the trained model to generalize during inference. More critically, the lack of old-class entity information hinders the distillation of old knowledge, causing the model to fall into what we refer to as the Few-Shot Distillation Dilemma. In this work, we address the above challenges through a prompt tuning paradigm and a memory demonstration template strategy. Specifically, we designed an expandable Anchor words-oriented Prompt Tuning (APT) paradigm to bridge the gap between pre-training and fine-tuning, thereby enhancing performance in few-shot scenarios. Additionally, we incorporated Memory Demonstration Templates (MDT) into each training instance to provide replay samples from previous tasks, which not only avoids the Few-Shot Distillation Dilemma but also promotes in-context learning. Experiments show that our approach achieves competitive performance on FS-CLNER.

entity type, machine learning, natural language, (13 more...)

arXiv.org Artificial Intelligence

2508.07248

Country:

Europe (0.68)
Asia (0.68)

Genre: Research Report (0.82)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Testing Hypotheses of Covariate Effects on Topics of Discourse

Phelan, Gabriel, Campbell, David A.

arXiv.org Machine Learning, Jun-9-2025

We introduce an approach to topic modelling with document-level covariates that remains tractable in the face of large text corpora. This is achieved by de-emphasizing the role of parameter estimation in an underlying probabilistic model, assuming instead that the data come from a fixed but unknown distribution whose statistical functionals are of interest. We propose combining a convex formulation of non-negative matrix factorization with standard regression techniques as a fast-to-compute and useful estimate of such a functional. Uncertainty quantification can then be achieved by reposing non-parametric resampling methods on top of this scheme. This is in contrast to popular topic modelling paradigms, which posit a complex and often hard-to-fit generative model of the data. We argue that the simple, non-parametric approach advocated here is faster, more interpretable, and enjoys better inferential justification than said generative models. Finally, our methods are demonstrated with an application analysing covariate effects on discourse of flavours attributed to Canadian beers.

covariate, machine learning, natural language, (18 more...)

arXiv.org Machine Learning

2506.0557

Country: